Semantic dictionary viewed as a lexical database
Authors
Abstract
In this paper an expert system is described which is called Lexicographer and which aims at supplying the user with diverse information about Russian words, including bibliographic information concerning individual lexical items. It is supposed that the system may be of use for a practical computational linguist and at the same time will serve as an instrument of linguistic research.

ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 1295 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992

Lexical database and its advantages over traditional dictionaries

In this paper we investigate general principles implemented in an expert system (called LEXICOGRAPHER), designed to supply the user with diverse information about Russian words, cf. [2]. The system is conceived as an aid both in the area of natural language processing and in traditional lexicography. The system consists of two basic components: the Lexicon (containing some 13,000 of the most common words) and the Bibliographical database. It is the Lexicon that is of primary concern in this paper.

The idea was to present the Lexicon in the form of a lexical database (LDB). An LDB is a vocabulary presented in a machine-readable form and consisting of several domains, as in a usual relational database. The user may get information about morphology, syntactic combinability and semantic features of individual lexical items. It is semantics that we concentrate upon in this paper. Many attempts have been made to use traditional dictionaries in order to assign word senses to general semantic categories, cf. [1]. Our LDB contains semantic information that cannot be elicited from the existing dictionaries. Priority is given to semantic features influencing lexical or grammatical co-occurrence.
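The comparison with a relational database can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the table name `lexicon`, the column names, the semantic class labels and the three transliterated sample nouns are hypothetical and are not taken from the Lexicographer system itself.

```python
import sqlite3

# Hypothetical sample rows: (lexeme, gender, animate, semantic_class).
# The class labels are illustrative, not the system's own inventory.
SAMPLE_ROWS = [
    ("stol", "masc", 0, "ARTEFACT"),    # 'table'
    ("koshka", "fem", 1, "ANIMAL"),     # 'cat'
    ("okno", "neuter", 0, "ARTEFACT"),  # 'window'
]

# Columns that may be queried; a whitelist keeps column names out of user input.
DOMAINS = ("gender", "animate", "semantic_class")


def build_lexicon(rows):
    """Create an in-memory table with one column per (assumed) domain."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lexicon ("
        "lexeme TEXT PRIMARY KEY, gender TEXT, "
        "animate INTEGER, semantic_class TEXT)"
    )
    conn.executemany("INSERT INTO lexicon VALUES (?, ?, ?, ?)", rows)
    return conn


def words_with(conn, domain, value):
    """Return all lexemes whose given domain holds the given value."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain}")
    cursor = conn.execute(
        f"SELECT lexeme FROM lexicon WHERE {domain} = ? ORDER BY lexeme",
        (value,),
    )
    return [row[0] for row in cursor]


conn = build_lexicon(SAMPLE_ROWS)
print(words_with(conn, "semantic_class", "ARTEFACT"))
```

Each domain becomes a column, so a question such as "which nouns are artefacts?" is a single query rather than a scan of dictionary entries.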
In this paper we discuss the possibility of predicting selectional restrictions, syntactic features and other formal characteristics of the utterance, such as the array of arguments and their semantic interpretation, the meaning of an aspectual form of a verb, etc., on the basis of the semantic features of a word in the lexicon. The main advantage of a lexical database over a traditional dictionary is that a database presents semantic information in a format that enables the computer to locate efficiently various types of information specified for a given class of words. To put it differently, the main advantage of a database consists in the possibility of compiling lists of words possessing a common feature or a set of features.

There are three main principles that the system is based upon.

1. We are convinced that semantic features of words determine co-occurrence to a much greater extent than is usually acknowledged. In other words, we claim that many aspects of the syntactic subcategorization of lexical items are predictable from their meaning.

2. A semantic feature of a word is essentially a semantic component (or components) in its lexicographic definition.

3. A great amount of information about the meaning of a lexical unit, about its combinatory possibilities, prosody and referential features, or about its regular ambiguity, need not be stored in the dictionary: this information belongs to what may be called a grammar of lexicon and should be formulated in a generalized form. In this form it can be stored in a Lexical Knowledge-Base of semantic and syntactic regularities. This Knowledge-Base has not yet been designed, but semantic features of words in the LDB are conceived as an input for general rules that will be stored in this hypothetical Knowledge-Base.
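The two ideas above (compiling lists of words that share features, and feeding those features into general rules) can be sketched as follows. This is a minimal illustration under stated assumptions: the feature labels, the sample nouns and the single rule in `RULES` are hypothetical, chosen only to show the mechanism, and do not reproduce the system's actual features or its not yet designed Knowledge-Base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    word: str
    features: frozenset


# Hypothetical entries; the feature labels are illustrative only.
LEXICON = [
    Lexeme("stakan", frozenset({"CONCRETE", "ARTEFACT", "CONTAINER"})),  # 'glass'
    Lexeme("vedro", frozenset({"CONCRETE", "ARTEFACT", "CONTAINER"})),   # 'bucket'
    Lexeme("kamen", frozenset({"CONCRETE", "NATURAL_OBJECT"})),          # 'stone'
]

# Hypothetical general rule: a semantic feature predicts a combinatory property.
RULES = {
    "CONTAINER": "may take a genitive noun naming the contents",
}


def words_sharing(lexicon, required):
    """List the words that possess every feature in `required`."""
    required = frozenset(required)
    return [lx.word for lx in lexicon if required <= lx.features]


def predict(lexeme):
    """Apply the general rules to one lexeme's features."""
    return sorted(RULES[f] for f in lexeme.features if f in RULES)


print(words_sharing(LEXICON, {"ARTEFACT", "CONTAINER"}))
print(predict(LEXICON[0]))
```

The rule is stated once, against a feature, instead of being repeated in every entry, which is the sense in which such information need not be stored in the dictionary itself.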
2 Lexical Database for Concrete Nouns

There are different layers of lexicon that require specific formats of a database, and the choice of the format is one of the main problems of database formation. In what follows we list the domains in the Lexical Database for Concrete Nouns, one of the components of Lexicographer, now implemented in a working program. Each domain is interpreted as a feature that can take a definite set of values.

Domain I. Morphological and syntactico-morphological information (taken from the grammatical dictionary [3]). This domain is subdivided into three domains:

I.1. Gender (fem., masc., neuter, common).
I.2. Animate/Inanimate.
I.3. Declension and accentuation.

All the other domains contain semantic information. We do not mean that the system of semantic features would provide a word with an exhaustive lexicographic definition; this is not the appropriate task for a lexical database. The purpose of a database is to highlight those semantic aspects of a word that unite semantically cognate words and differentiate many semantically different words from one another. In other words, a lexical database is an instrument for predicting and calculating all sorts of useful semantic classes of words.

Domains II.1 and II.2 specify the mereological status of a word (more precisely, of a lexeme, namely, of a word taken in one of its lexical meanings). The values of the feature II.1 may be: PART, SET or WHOLE. In the latter case domain II.2 is empty, while in the first two cases it specifies the WHOLE for the PART and the ELEMENT for the SET: PART (SET) of what? E.g.,
Similar resources
Syntagma Lexical Database
This paper discusses the structure of Syntagma's Lexical Database (focused on Italian). The basic database consists in four tables. Table Forms contains word inflections, used by the POS-tagger for the identification of input-words. Forms is related to Lemma. Table Lemma stores all kinds of grammatical features of words, word-level semantic data and restrictions. In the table Meanings meaning-r...
A Bilingual Electronic Dictionary for Frame Semantics
Frame semantics is a linguistic theory which is currently gaining ground. The creation of lexical entries for a large number of words presupposes the development of complex lexical acquisition techniques in order to identify the vocabulary for describing the elements of a 'frame'. In this paper, we show how a lexical-semantic database compiled on the basis of a bilingual (English-French) dictio...
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
LINGUISTIC DESCRIPTION IN DICTIONARIES: SEMANTICS Extraction of semantic relations from a Basque monolingual dictionary using Constraint Grammar
This paper deals with the exploitation of dictionaries for the semi-automatic construction of lexicons and lexical knowledge bases. The final goal of our research is to enrich the Basque Lexical Database with semantic information such as senses, definitions, semantic relations, etc., extracted from a Basque monolingual dictionary. The work here presented focuses on the extraction of the semanti...
Extraction of semantic relations from a Basque monolingual dictionary using Constraint Grammar
This paper deals with the exploitation of dictionaries for the semi-automatic construction of lexicons and lexical knowledge bases. The final goal of our research is to enrich the Basque Lexical Database with semantic information such as senses, definitions, semantic relations, etc., extracted from a Basque monolingual dictionary. The work here presented focuses on the extraction of the semanti...
Semantic Modeling in Morpheme-based Lexica for Greek
A Machine Readable Dictionary (MRD or Lexicon) can be designed as a large-scale lexical database, having the task of supporting many different applications such as morphological, syntactic and semantic processing, information retrieval, machine translation, educational tools, etc. Regardless of how different these applications may be, they need a comprehensive lexical database to rely on, since...
Publication date: 1992